An entropic generalization of Caffarelli’s contraction theorem via covariance inequalities
The optimal transport map between the standard Gaussian measure and an
α-strongly log-concave probability measure is α^{-1/2}-Lipschitz,
as first observed in a celebrated theorem of Caffarelli. In this paper, we
apply two classical covariance inequalities (the Brascamp-Lieb and Cramér-Rao
inequalities) to prove a sharp bound on the Lipschitz constant of the map that
arises from entropically regularized optimal transport. In the limit as the
regularization tends to zero, we obtain an elegant and short proof of
Caffarelli's original result. We also extend Caffarelli's theorem to the
setting in which the Hessians of the log-densities of the measures are bounded
by arbitrary positive definite commuting matrices.
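For reference, the limiting statement recovered here (Caffarelli's contraction theorem) can be written as follows; the notation (γ for the standard Gaussian, T for the Brenier map) is chosen for this sketch:

```latex
% Caffarelli's contraction theorem (the zero-regularization limit above).
% Let \gamma = \mathcal{N}(0, I_d) and let \mu = e^{-V}\,\mathrm{d}x satisfy
% \nabla^2 V \succeq \alpha I_d (i.e., \mu is \alpha-strongly log-concave).
% Then the optimal transport (Brenier) map T pushing \gamma onto \mu satisfies
\[
  \lVert T(x) - T(y) \rVert \;\le\; \alpha^{-1/2} \, \lVert x - y \rVert
  \qquad \text{for all } x, y \in \mathbb{R}^d .
\]
```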
Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent
We study first-order optimization algorithms for computing the barycenter of
Gaussian distributions with respect to the optimal transport metric. Although
the objective is geodesically non-convex, Riemannian GD empirically converges
rapidly, in fact faster than off-the-shelf methods such as Euclidean GD and SDP
solvers. This stands in stark contrast to the best-known theoretical results
for Riemannian GD, which depend exponentially on the dimension. In this work,
we prove new geodesic convexity results which provide stronger control of the
iterates, yielding a dimension-free convergence rate. Our techniques also
enable the analysis of two related notions of averaging, the
entropically-regularized barycenter and the geometric median, providing the
first convergence guarantees for Riemannian GD for these problems.
Comment: 48 pages, 8 figures
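As a concrete illustration of the averaging scheme, here is a minimal NumPy sketch of the Riemannian GD iteration for the barycenter of centered Gaussians N(0, Σᵢ); the function names and the unit step size are illustrative assumptions, not the paper's code:

```python
import numpy as np

def sqrtm_psd(A):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.T

def bw_barycenter(Sigmas, steps=100, eta=1.0):
    """Riemannian GD on the Bures-Wasserstein manifold for the barycenter
    of centered Gaussians N(0, Sigma_i); eta = 1 is the natural step size."""
    d = Sigmas[0].shape[0]
    Sigma = np.eye(d)  # arbitrary positive definite initialization
    for _ in range(steps):
        S_half = sqrtm_psd(Sigma)
        S_half_inv = np.linalg.inv(S_half)
        # average of the OT maps T_i from N(0, Sigma) to N(0, Sigma_i)
        T_avg = sum(
            S_half_inv @ sqrtm_psd(S_half @ Si @ S_half) @ S_half_inv
            for Si in Sigmas
        ) / len(Sigmas)
        # geodesic step: Sigma <- (I + eta (T_avg - I)) Sigma (I + eta (T_avg - I))
        M = np.eye(d) + eta * (T_avg - np.eye(d))
        Sigma = M @ Sigma @ M
    return Sigma
```

For commuting (e.g. diagonal) covariances, the barycenter covariance is the square of the average of the square roots, which gives a quick sanity check of the iteration.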
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
We provide theoretical convergence guarantees for score-based generative
models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which
constitute the backbone of large-scale real-world generative models such as
DALL·E 2. Our main result is that, assuming accurate score estimates,
such SGMs can efficiently sample from essentially any realistic data
distribution. In contrast to prior works, our results (1) hold for an
L²-accurate score estimate (rather than L∞-accurate); (2) do not
require restrictive functional inequality conditions that preclude substantial
non-log-concavity; (3) scale polynomially in all relevant problem parameters;
and (4) match state-of-the-art complexity guarantees for discretization of the
Langevin diffusion, provided that the score error is sufficiently small. We
view this as strong theoretical justification for the empirical success of
SGMs. We also examine SGMs based on the critically damped Langevin diffusion
(CLD). Contrary to conventional wisdom, we provide evidence that the use of the
CLD does not reduce the complexity of SGMs.
Comment: 30 pages
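The sampling mechanism analyzed here can be illustrated in one dimension with an exactly known score: noise the data with an Ornstein-Uhlenbeck forward process, then discretize the score-driven reverse SDE. All specifics below (the OU forward process, horizon, and step count) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Forward process: OU dX = -X dt + sqrt(2) dW, run to time T.
# For data ~ N(0, s0sq), the time-t marginal is N(0, var_t) with
# var_t = s0sq * exp(-2t) + (1 - exp(-2t)), so the score is -x / var_t.
T, n_steps, s0sq = 5.0, 500, 4.0
dt = T / n_steps

def score(x, t):
    var_t = s0sq * np.exp(-2 * t) + (1 - np.exp(-2 * t))
    return -x / var_t

# Reverse-time SDE, Euler-Maruyama: dY = [Y + 2 score(Y, T - t)] dt + sqrt(2) dW.
# Initialize from the time-T forward marginal (close to N(0, 1) for large T).
var_T = s0sq * np.exp(-2 * T) + (1 - np.exp(-2 * T))
y = np.sqrt(var_T) * rng.standard_normal(100_000)
t = 0.0
for _ in range(n_steps):
    y += (y + 2 * score(y, T - t)) * dt + np.sqrt(2 * dt) * rng.standard_normal(y.shape)
    t += dt

print(y.var())  # approximately s0sq = 4.0: samples recover the data law
```

With the exact score, the only errors are discretization and Monte Carlo noise; replacing `score` by an L²-accurate estimate is the setting the paper's guarantees address.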
Query lower bounds for log-concave sampling
Log-concave sampling has witnessed remarkable algorithmic advances in recent
years, but the corresponding problem of proving lower bounds for this task has
remained elusive, with lower bounds previously known only in dimension one. In
this work, we establish the following query lower bounds: (1) sampling from
strongly log-concave and log-smooth distributions in dimension d ≥ 2
requires Ω(log κ) queries, which is sharp in any constant
dimension, and (2) sampling from Gaussians in dimension d (hence also from
general log-concave and log-smooth distributions in dimension d) requires
Ω̃(min(√κ log d, d)) queries, which is nearly sharp
for the class of Gaussians. Here κ denotes the condition number of the
target distribution. Our proofs rely upon (1) a multiscale construction
inspired by work on the Kakeya conjecture in harmonic analysis, and (2) a novel
reduction that demonstrates that block Krylov algorithms are optimal for this
problem, as well as connections to lower bound techniques based on Wishart
matrices developed in the matrix-vector query literature.
Comment: 46 pages, 2 figures
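For concreteness, the condition number κ of a Gaussian target N(0, Σ) is the ratio of the extreme eigenvalues of Σ, since the potential V(x) = x^T Σ^{-1} x / 2 has Hessian Σ^{-1}; the covariance matrix below is an arbitrary example:

```python
import numpy as np

# For N(0, Sigma): strong convexity alpha = 1 / lambda_max(Sigma),
# smoothness L = 1 / lambda_min(Sigma), so kappa = L / alpha
# = lambda_max(Sigma) / lambda_min(Sigma).
Sigma = np.diag([1.0, 4.0, 25.0])  # illustrative covariance
eigs = np.linalg.eigvalsh(Sigma)
kappa = eigs.max() / eigs.min()
print(kappa)  # 25.0
```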